Abstract

The noise of a device under test (DUT) is measured simultaneously 
with two instruments, each of which contributes its own background. The 
average cross power spectral density converges to the DUT power spectral 
density. This method enables the extraction of the DUT noise spectrum, 
even if it is significantly lower than the background. After a snapshot 
on practical experiments, we go through the statistical theory and the 
choice of the estimator. A few experimental techniques are described, with 
reference to phase noise and amplitude noise in RF/microwave systems 
and in photonic systems. The set of applications of this method is wide. 
The final section gives a short panorama on radio-astronomy, radiometry, 
quantum optics, thermometry (fundamental and applied), semiconductor 
technology, metallurgy, etc. 

This report is intended as a tutorial, as opposed to a report on advanced
research, yet addressed to a broad readership: technicians, practitioners,
Ph.D. students, academics, and full-time scientists.

Notation



Symbol                              Meaning

$a(t) \leftrightarrow A(f)$         background noise of the instrument A
$b(t) \leftrightarrow B(f)$         background noise of the instrument B
$c(t) \leftrightarrow C(f)$         DUT noise, i.e., the useful signal
$b_i$                               coefficients of the power-law approximation of $S_\varphi(f)$ (in AM-PM noise)
dev{ }                              deviation, dev$\{x\} = \sqrt{V\{x\}}$
E{ }                                mathematical expectation
$f$                                 Fourier frequency, Hz
$f(x)$                              probability density function (PDF)
$F(x)$                              cumulative density function (CDF)
$\mathcal{F}\{\ \}$                 Fourier transform operator
$h_i$                               coefficients of the power-law model of $S_\alpha(f)$ or $S_y(f)$ (in AM-PM noise)
$i$                                 integer number, often used as an index
$\imath$                            imaginary unit, $\imath^2 = -1$
$\Im\{\ \}$                         imaginary part of a complex quantity, as in $X'' = \Im\{X\}$
$m$                                 number of averaged spectra, as in $\langle S_{yx} \rangle_m$
$O(\ )$                             order of, as in $e^x = 1 + x + O(x^2)$
P{ }                                probability, as in P$\{x > 0\}$
$P_N$                               probability that a value is negative, as in $P_N = $ P$\{x < 0\}$
$P_P$                               probability that a value is positive, as in $P_P = $ P$\{x > 0\}$
$R_{xx}(t')$                        autocorrelation function
$\Re\{\ \}$                         real part of a complex quantity, as in $X' = \Re\{X\}$
$S_{xx}(f)$                         PSD of the quantity $x$
$S_{yx}(f)$                         cross PSD of the quantities $y$ and $x$
$t$                                 time
$T$                                 measurement time
V{ }                                variance, mathematical expectation of $|x - E\{x\}|^2$
$x(t) \leftrightarrow X(f)$         generic variable
$x(t) \leftrightarrow X(f)$         signal at the FFT analyzer input, channel 1
$\mathbf{x}(t)$, $\mathbf{y}(t)$    stochastic processes, of which $x(t)$ and $y(t)$ are realizations
$y(t) \leftrightarrow Y(f)$         generic variable
$y(t) \leftrightarrow Y(f)$         signal at the FFT analyzer input, channel 2
$\alpha(t) \leftrightarrow A(f)$    normalized-amplitude noise (in AM-PM noise)
$\Gamma(x)$                         the gamma function used in probability
$\kappa^2$                          PSD of the signal $c(t)$
$\mu$                               average (the value of)
$\nu$                               frequency (Hz), used for carrier signals (in AM-PM noise)
$\nu$                               no. of degrees of freedom, in probability functions
$\sigma(\tau)$                      Allan deviation, square root of the Allan variance (in AM-PM noise)
$\tau$                              measurement time of the Allan variance (in AM-PM noise)
$\varphi(t) \leftrightarrow \Phi(f)$  phase noise (in AM-PM noise)
$\chi^2$                            in probability, $\chi^2 = x_1^2 + x_2^2 + x_3^2 + \ldots$ originates the $\chi^2$ distribution

Subscript                           Meaning

$T$                                 truncated over the meas. time $T$, as in $x_T(t)$, $X_T(f)$

Superscript                         Meaning

$*$                                 complex conjugate, as in $|X|^2 = X X^*$

Symbol                              Meaning

$\langle\ \rangle$                  average. Also $\langle\ \rangle_m$, average of $m$ values
$\hat{\ }$ (hat)                    estimator of a quantity, as in $\hat{S}_{yx} = \langle S_{yx} \rangle_m$
$'$, $''$                           real and imaginary part, as in $X = X' + \imath X''$
$\leftrightarrow$                   transform inverse-transform pair, as in $x(t) \leftrightarrow X(f)$
$\dot{\ }$ (dot)                    time-derivative, as in $\dot{\varphi}(t)$ (in AM-PM noise)

Acronym                             Meaning

AM                                  Amplitude Modulation, often 'AM noise' (in AM-PM noise)
CDF                                 Cumulative Density Function
DUT                                 Device Under Test
FFT                                 Fast Fourier Transform
PM                                  Phase Modulation, often 'PM noise' (in AM-PM noise)
PDF                                 Probability Density Function
PLL                                 Phase Locked Loop (in AM-PM noise)
PSD                                 (single-side) Power Spectral Density

Font / case                         Meaning

uppercase                           Fourier transform of the lower-case function
rm-bf                               stochastic processes, as in $x(t)$ is a realization of $\mathbf{x}(t)$

Font/case is used in this way only in some special (and obvious) cases.
1 Introduction

Measuring a device under test (DUT), the observed spectrum contains the DUT
noise, which we can call signal because it is the object of the measurement,
and the background noise of the instrument. The core of the cross-spectrum
measurement method is that we can measure the DUT simultaneously with two
equal instruments. Provided that experimental skill and a pinch of good luck
guarantee that DUT and instruments are statistically independent, statistics
enables us to extract the DUT spectrum from the background.

The two-channel measurement can be modeled as the block diagram of Fig. 1,
where $a(t)$ and $b(t)$ are the background of the two instruments, and $c(t)$ the DUT
noise, under the hypothesis that $a(t)$, $b(t)$ and $c(t)$ are statistically independent.
Thus, the observed signals are

$$ x(t) = c(t) + a(t) $$
$$ y(t) = c(t) + b(t) . $$

We are interested in the power spectral density^1 (PSD), which is a normalized
form of spectrum that expresses the power per unit of bandwidth, denoted with
$S(f)$. It will be shown that the average cross-PSD $\langle S_{yx}(f) \rangle$ converges to the
DUT PSD $S_{cc}(f)$, which is what we want to measure.

The idea of the cross-spectrum method is explained in Fig. 2. This figure
builds from the output of the free-running analyzer, after selecting one frequency
($f_0$). This is a sequence of $|S_{yx}(f_0)|$ called realizations, which we average on
contiguous groups of $m$ values, $|\langle S_{yx}(f) \rangle_m|$. The averages form a (slower)
sequence whose statistical properties depend on $m$. So, Fig. 2 plots the average
and the deviation of the sequence of averages, as a function of $m$. At small
values of $m$, the background is dominant and decreases as $m$ increases. Beyond
$m \approx 100$, we observe that $|\langle S_{yx}(f) \rangle_m|$ stops decreasing and approaches the
value of 0.1 (-10 dB), which is the DUT noise in this example. The standard
deviation further decreases. The background is dominant below $m \approx 100$.
Beyond, the DUT noise shows up and the estimation accuracy increases, as seen
from the deviation-to-average ratio. Notice that the choice of $|\langle S_{yx}(f) \rangle_m|$ as
an estimator of $S_{yx}(f)$ is still arbitrary and will be further discussed.

All this report is about how and why the cross-spectrum converges to the
DUT noise $S_{cc}(f)$, and about how this fact can be used in the laboratory
practice. The scheme of Fig. 1 is analyzed from the following standpoints.

^1 The PSD as a statistical concept will be defined afterwards. Newcomers can provisionally
use $S_{yx}(f) = \frac{1}{T} Y(f) X^*(f)$, which is the readout of the FFT analyzer. $T$ is the
measurement time.

Figure 1: Basics of the cross-spectrum method.

Figure 2: Average and deviation of the cross spectrum $|\langle S_{yx} \rangle_m|$, as a function
of the number $m$ of averaged realizations of white Gaussian noise. Since the
statistical properties of $S_{yx}(f)$ are the same at any frequency, only one point
(i.e., one frequency) is shown and the variable $f$ is dropped. The DUT noise is
10 dB lower than the background.

Normal use. All the noise processes [$a(t)$, $b(t)$ and $c(t)$] have non-negligible
power. We use the statistics to extract $S_{cc}(f)$.

Statistical limit. In the absence of any correlated phenomenon, thus with $c = 0$,
the average cross spectrum takes a finite nonzero value, limited by the
number of averaged realizations.

Hardware limit. After removing the DUT, a (small) correlated part remains.
This phenomenon, due to crosstalk or to other effects, limits the
instrument sensitivity.

Though the author is inclined to use phase and amplitude noise as the
favorite examples (Sections 8.1 and 8.2), the cross-spectrum method is of far more
general interest. Examples from a variety of research fields will be discussed in
Section 8.3.

As a complement to this report, the reader is encouraged to refer to classical
textbooks of probability and statistics, among which [Fell2], [Pap92], [Cra46],
[DR58] are preferred.



2 Power spectral density 

The processes we describe are stationary and ergodic. The requirement that
noise be stationary and ergodic is not a stringent constraint in the laboratory
practice because the words 'stationary' and 'ergodic' are the equivalent of
'repeatable' and 'reproducible' in experimental physics. Thus, a realization $x(t)$
has the same statistical properties independently of the origin of time, and also
the statistical properties of the entire process $\mathbf{x}(t)$. Unless otherwise specified,
$x(t)$ is a zero-mean finite-power process. The power spectral density (PSD) of
such processes is

$$ S_{xx}(f) = \mathcal{F}\{ R_{xx}(t') \} \qquad (1) $$

where $\mathcal{F}\{\ \}$ is the Fourier transform operator,

$$ R_{xx}(t') = E\{ \mathbf{x}(t)\, \mathbf{x}(t + t') \} \qquad (2) $$

the autocorrelation function, and $E\{\ \}$ the mathematical expectation.

As a simplified notation, we use the upper case for the Fourier transform,
and the left-right arrow for the transform inverse-transform pair, thus

$$ x(t) \leftrightarrow X(f) \qquad \text{Fourier transform - inverse transform pair} . $$

The two-sided Fourier transform and spectra are generally preferred in
theoretical issues, while the experimentalist often prefers the single-sided representation.
Though we use the one-sided representation in all figures, often we do not need
the distinction between one-sided and two-sided representation. In most
practical measurements the Fast Fourier Transform (FFT) replaces the traditional
Fourier transform, and the frequency is a discrete variable.

The Wiener-Khintchine theorem for ergodic and stationary processes makes it
possible to calculate the PSD through the absolute value of the Fourier transform.
Thus it holds that

$$ E\{ S_{xx}(f) \} = E\Big\{ \lim_{T \to \infty} \Big[ \frac{1}{T} X_T(f)\, X_T^*(f) \Big] \Big\} \qquad (3) $$

$$ \phantom{E\{ S_{xx}(f) \}} = E\Big\{ \lim_{T \to \infty} \Big[ \frac{1}{T} |X_T(f)|^2 \Big] \Big\} , \qquad (4) $$

where the subscript $T$ means truncated over the measurement time $T$, and
the superscript $*$ stands for complex conjugate. By the way, the factor $1/T$ is
necessary for $S_{xx}(f)$ to have the physical dimension of a power density, i.e.,
power per unit of frequency.

Omitting the expectation, (3) can be seen as a realization of the PSD. In
actual experiments the expectation is replaced with the average on a suitable
number $m$ of spectrum samples

$$ \langle S_{xx}(f) \rangle_m = \frac{1}{T} \big\langle |X_T(f)|^2 \big\rangle_m \qquad \text{(avg, } m \text{ spectra)} . \qquad (5) $$

As an obvious extension, the cross PSD of two generic random processes
$\mathbf{x}(t)$ and $\mathbf{y}(t)$

$$ S_{yx}(f) = \mathcal{F}\{ R_{yx}(t') \} \qquad (6) $$

is measured as

$$ \langle S_{yx}(f) \rangle_m = \frac{1}{T} \big\langle Y_T(f)\, X_T^*(f) \big\rangle_m . \qquad (7) $$

2.1 Measurement time T

In practical experiments the measurement time is finite, so we can only access
the truncated version $x_T(t) \leftrightarrow X_T(f)$ of a realization. In order to simplify the
notation, the subscript $T$ for the truncation time will be omitted. Thus for
example we write (7) as

$$ \langle S_{yx}(f) \rangle_m = \frac{1}{T} \big\langle Y(f)\, X^*(f) \big\rangle_m \qquad \text{(abridged notation)} . $$
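The averaged estimates of this Section can be illustrated with a short simulation. The sketch below is not taken from the original text: the function names, the unit sampling interval (so that T equals the number of samples n), the naive DFT, and the two-sided normalization are all assumptions made for the example.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform, adequate for short records."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def averaged_spectra(m, n, kappa, rng):
    """Return (<S_xx>_m, <S_yx>_m) as lists over the n frequency bins.

    Assumptions of this sketch: unit sampling interval (T = n), white
    Gaussian backgrounds a(t), b(t) of unit variance, and a common white
    Gaussian signal c(t) of variance kappa**2. Spectra are two-sided.
    """
    sxx = [0.0] * n
    syx = [0j] * n
    for _ in range(m):
        c = [kappa * rng.gauss(0.0, 1.0) for _ in range(n)]
        x = [ci + rng.gauss(0.0, 1.0) for ci in c]  # x = c + a
        y = [ci + rng.gauss(0.0, 1.0) for ci in c]  # y = c + b
        big_x, big_y = dft(x), dft(y)
        for j in range(n):
            sxx[j] += abs(big_x[j]) ** 2 / (n * m)               # (1/T) |X|^2
            syx[j] += big_y[j] * big_x[j].conjugate() / (n * m)  # (1/T) Y X*
    return sxx, syx


if __name__ == "__main__":
    rng = random.Random(1)
    sxx, syx = averaged_spectra(m=200, n=32, kappa=math.sqrt(0.1), rng=rng)
    # Averages along the frequency axis: near 1 + kappa^2 and near kappa^2.
    print("mean <S_xx>     :", sum(sxx) / len(sxx))
    print("mean Re{<S_yx>} :", sum(s.real for s in syx) / len(syx))
```

With these assumptions the single-channel average sits near the background plus the common noise, while the real part of the cross spectrum sits near the common noise alone.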

2.2 Why white Gaussian noise 

However simplistic it may seem at first sight, the use of white Gaussian
noise is justified as follows. First, spectrally-smooth noise phenomena originate
from large-number statistics (electrons and holes, semiconductor defects, shot
noise, etc.), which by virtue of the central limit theorem yield a Gaussian
process. Second, most non-white noise phenomena of interest follow the
power-law model $S(f) = \sum_i h_i f^i$, hence they can be converted into white noise
after multiplication by a suitable power of $f$ without affecting the PDF, and
converted back after analysis. The idea of whitening and un-whitening a noise
spectrum is by the way of far broader usefulness than shown here. For these
reasons we can take full benefit from the simplicity of white Gaussian noise.
Yet, it is understood that white noise rolls off at some point, so that all signals
have finite power.
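The whitening idea can be sketched in a few lines for a single power-law term $S(f) = h f^{i}$. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
def whiten(freqs, psd, exponent):
    """Divide S(f) by f**exponent, so that a pure power law h * f**exponent becomes flat."""
    return [s / f ** exponent for f, s in zip(freqs, psd)]


def unwhiten(freqs, white_psd, exponent):
    """Inverse operation: restore the original slope after the analysis."""
    return [s * f ** exponent for f, s in zip(freqs, white_psd)]


freqs = [1.0, 10.0, 100.0, 1000.0]      # Fourier frequencies, Hz (made-up values)
h = 2e-12                               # made-up flicker coefficient
flicker = [h / f for f in freqs]        # S(f) = h * f**-1

flat = whiten(freqs, flicker, -1)       # every entry is (close to) h
restored = unwhiten(freqs, flat, -1)    # back to the 1/f spectrum
```

A spectrum made of several power-law terms would need the whitening applied region by region, which this sketch does not attempt.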



3 The cross-spectrum method 

Recalling the definitions of Section 1, we denote with $a(t)$ and $b(t)$ the
background of the two instruments, with $c(t)$ the common noise, and with $A$, $B$ and
$C$ their Fourier transform, leaving the frequency implied. Working with
realizations, we no longer need a separate notation for the process. By definition,
$a(t)$, $b(t)$ and $c(t)$ are statistically independent. We also assume that they are
ergodic and stationary. The two instrument outputs are

$$ x(t) = c(t) + a(t) \leftrightarrow X = C + A \qquad (8) $$
$$ y(t) = c(t) + b(t) \leftrightarrow Y = C + B . \qquad (9) $$

First, we observe that the cross-spectrum $S_{yx}$ converges to $S_{cc}$. In fact,

$$ E\{ S_{yx} \} = \frac{1}{T} E\{ Y X^* \} $$
$$ = \frac{1}{T} E\{ [C + A] \times [C + B]^* \} $$
$$ = \frac{1}{T} \big[ E\{CC^*\} + E\{CB^*\} + E\{AC^*\} + E\{AB^*\} \big] $$
$$ = S_{cc} \qquad (10) $$

because the hypothesis of statistical independence gives

$$ E\{CB^*\} = 0, \quad E\{AC^*\} = 0, \quad \text{and} \quad E\{AB^*\} = 0 . $$

Then we replace the expectation with the average on $m$ measured spectra

$$ \langle S_{yx} \rangle_m = \frac{1}{T} \langle Y X^* \rangle_m $$
$$ = \frac{1}{T} \big[ \langle CC^* \rangle_m + \langle CB^* \rangle_m + \langle AC^* \rangle_m + \langle AB^* \rangle_m \big] $$
$$ = S_{cc} + O\big(\sqrt{1/m}\big) , \qquad (11) $$

where $O(\ )$ means 'order of.' Owing to statistical independence, the cross terms
decrease proportionally to $1/\sqrt{m}$.



3.1 Statistical limit 

With no DUT noise it holds that $c = 0$, hence $S_{cc} = 0$. Maintaining the
hypothesis of statistical independence of the two channels, we notice that the
number of averaged spectra sets a statistical limit to the measurement. Only
the cross terms remain in (11), which decrease proportionally to $1/\sqrt{m}$. Thus,
the statistical limit is

$$ \langle S_{yx} \rangle_m = \frac{1}{T} \langle AB^* \rangle_m \approx \sqrt{ \frac{1}{m} \langle S_{yy} \rangle_m \langle S_{xx} \rangle_m } \qquad \text{(statistical limit)} . \qquad (12) $$

Accordingly, a 5 dB improvement on the single-channel noise costs a factor of 10
in averaging, thus in measurement time. The convergence law will be extensively
discussed afterwards.
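The 5 dB per decade rule can be checked numerically at a single frequency. The following is a minimal sketch, assuming backgrounds normalized to unit PSD and no DUT noise; the function names, seed, and trial counts are choices of this example, not of the original text.

```python
import math
import random


def complex_gauss(rng, variance=1.0):
    """Complex Gaussian sample; real and imaginary parts each carry variance/2."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def mean_abs_cross_spectrum(m, trials, rng):
    """Average of |<S_yx>_m| over independent trials, with c = 0.

    One frequency bin; the backgrounds are normalized to S_aa = S_bb = 1.
    """
    total = 0.0
    for _ in range(trials):
        acc = 0j
        for _ in range(m):
            a = complex_gauss(rng)    # X = A
            b = complex_gauss(rng)    # Y = B
            acc += b * a.conjugate()  # Y X*
        total += abs(acc / m)
    return total / trials


def improvement_db(m_small, m_large, trials, rng):
    """Reduction of the statistical limit, in dB, going from m_small to m_large."""
    limit_small = mean_abs_cross_spectrum(m_small, trials, rng)
    limit_large = mean_abs_cross_spectrum(m_large, trials, rng)
    return 10.0 * math.log10(limit_small / limit_large)


if __name__ == "__main__":
    rng = random.Random(7)
    # A factor of 100 in m should give roughly 10 dB, i.e. 5 dB per decade.
    print(improvement_db(16, 1600, trials=200, rng=rng))
```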



3.2 Hardware limit 

Breaking the hypothesis of the statistical independence of the two channels, we
are interested in the correlated noise of the instrument, which limits the
sensitivity. This can be due for example to the crosstalk between the two channels,
or to environmental fluctuations (ac magnetic fields, temperature, etc.) acting
simultaneously on the two channels. The mathematical description is simplified
by setting the true DUT noise to zero, and by re-interpreting $c(t)$ as the
correlated noise of the instrument observed with an unlimited number of averaged
spectra

$$ E\{ S_{yx} \} = E\{ S_{cc} \} \qquad \text{(hardware limit)} . \qquad (13) $$

Nonetheless, the correct identification of this limit may require non-trivial
experimental skill.

3.3 Regular DUT measurement 

The accurate measurement of a regular DUT requires that

1. The number $m$ is large enough for the statistical limit to be negligible.

2. The hardware background noise is negligible as compared to the DUT
noise.

Under these conditions, the average cross spectrum converges to the expectation of
the DUT noise

$$ \langle S_{yx} \rangle_m \to E\{ S_{cc} \} \qquad \text{(DUT measurement)} . \qquad (14) $$

This is the regular use of the instrument.
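As a rough worked example of condition 1 (a sketch under stated assumptions, not a statement of the original text): take the backgrounds normalized to one, the DUT PSD equal to $\kappa^2$, and the average of $|\langle S_{yx} \rangle_m|$ without DUT noise equal to $0.886/\sqrt{m}$, as found in Section 6.2.1. The statistical limit then equals the DUT noise when

```latex
\frac{0.886}{\sqrt{m}} = \kappa^2
\quad\Longrightarrow\quad
m = \frac{0.785}{\kappa^4} ,
\qquad \text{e.g. } \kappa^2 = 0.1 \;\Rightarrow\; m \approx 79 .
```

This agrees with the crossover near $m \approx 100$ seen in Fig. 2. For the limit to be negligible rather than merely equal to the DUT noise, $m$ has to be larger by a comfortable margin.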
4 Running the experiment

Before getting through mathematical details, it is instructive to start from a
simplified picture of what happens when we run an experiment. For this
purpose, we choose $\hat{S}_{yx} = |\langle S_{yx} \rangle_m|$ as an estimator of $S_{yx}$, which is often the default
of the FFT analyzer in cross-spectrum mode. This estimator is suitable to be
displayed on a logarithmic scale (dB) because it takes only nonnegative values,
but it is biased. We observe the PSD on the display of the FFT analyzer as $m$
increases, looking for the signature of $S_{yx}$ converging to $S_{cc}$.

We restrict our attention to the case of DUT noise smaller than the
single-channel background, as it usually occurs when we need the correlation. The
purpose of this assumption is to make the simulations representative of the
laboratory practice. And of course we assume that the two channels are equal.

4.1 Ergodicity 

Averaging on $m$ realizations, the progression of a measurement gives a sequence
of spectra $|\langle S_{yx} \rangle_m|_i$ of running index $i$, as shown in Fig. 3. For a given frequency
$f_0$, the sequence $|\langle S_{yx}(f_0) \rangle_m|_i$ is a time series. Since $S_{yx}(f_1)$ and $S_{yx}(f_2)$ are
statistically independent for $f_1 \neq f_2$, also $|\langle S_{yx}(f_1) \rangle_m|_i$ and $|\langle S_{yx}(f_2) \rangle_m|_i$ are
statistically independent. For this reason, scanning the frequency axis gives
access to (a subset of) the statistical ensemble.

Ergodicity allows us to interchange time statistics and ensemble statistics, thus
the running index $i$ of the sequence and the frequency $f$. The important
consequence is that the average and the deviation calculated on the frequency axis
give access to the average and deviation of the time series, without waiting for
multiple realizations to be available. This property helps detect when the cross
spectrum leaves the $1/\sqrt{m}$ law and converges to the DUT noise.

Figure 4 shows a sequence of cross spectra $|\langle S_{yx} \rangle_m|$, increasing $m$ in powers
of two. On the left-hand side of Fig. 4 the DUT noise is set to zero. Increasing
$m$, the average cross spectrum decreases proportionally to $1/\sqrt{m}$, as emphasized
by the slanted plane. The $1/\sqrt{m}$ law is easily seen after averaging on the
frequency axis separately for each value of $m$, and then transposing the law
to each point of the frequency axis thanks to ergodicity. The right-hand side
of Fig. 4 shows the same simulation, yet with the DUT noise set to a value
of 10 dB lower than the single-channel background. At small values of $m$ the
cross-spectrum is substantially equal to the previous case. Yet at $m > 100$ the
cross-spectrum leaves the $1/\sqrt{m}$ law (slanted plane) and converges to the DUT
noise (horizontal plane at -10 dB). Once again, thanks to ergodicity we can
transpose the average on the frequency axis to each point of the frequency axis.

In the rest of this Section we will refer to a generic point of the PSD, leaving
the frequency unspecified. The variable $f$ is omitted in order to simplify the
notation. Hence for example we will write $\Re\{S_{yx}\}$ instead of $\Re\{S_{yx}(f)\}$.
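The use of the frequency axis as a statistical ensemble can be sketched as follows. This is a simplified illustration, assuming independent frequency bins of white Gaussian background normalized to unit PSD and no DUT noise; the function names and the numerical reference values 0.886 and 0.523 (quoted later in the text for the Rayleigh case) are used here only for comparison.

```python
import math
import random


def cross_spectrum_bins(m, n_bins, rng):
    """|<S_yx>_m| on n_bins independent frequency bins, no DUT noise.

    Each bin holds independent complex Gaussian backgrounds A and B with
    V{A} = V{B} = 1 (real and imaginary parts of variance 1/2 each).
    """
    s = math.sqrt(0.5)
    result = []
    for _ in range(n_bins):
        acc = 0j
        for _ in range(m):
            a = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
            b = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
            acc += b * a.conjugate()
        result.append(abs(acc / m))
    return result


def frequency_axis_stats(values):
    """Average and standard deviation taken along the frequency axis."""
    n = len(values)
    avg = sum(values) / n
    dev = math.sqrt(sum((v - avg) ** 2 for v in values) / (n - 1))
    return avg, dev


if __name__ == "__main__":
    rng = random.Random(3)
    for m in (4, 16, 64, 256):
        avg, dev = frequency_axis_stats(cross_spectrum_bins(m, 512, rng))
        # avg * sqrt(m) should stay near 0.886 and dev / avg near 0.523.
        print(m, round(avg * math.sqrt(m), 3), round(dev / avg, 3))
```

A single averaged spectrum is enough here: the statistics are taken across bins, not across repeated measurements.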

4.2 Single-channel noise. 

It is explained in Sec. 5 that the single-channel PSD $\langle S_{xx} \rangle_m$ is $\chi^2$ distributed
with $2m$ degrees of freedom. The average PSD is equal to $\frac{1}{T} V\{X\} = \frac{1}{T} V\{A\} +
\frac{1}{T} V\{C\}$, where $V\{\ \}$ is the variance; the deviation-to-average ratio is equal to
$1/\sqrt{m}$. Of course the same holds for $S_{yy}$, after replacing $A$ with $B$.

The track seen on the display converges to the DUT noise plus the
background noise, and shrinks as $m$ increases. The track thickness is twice the
deviation. This fact is shown in Fig. 5. The green plot, labeled $|S_{xx}|$, keeps the
same vertical position as $m$ increases, and shrinks.

4.3 Cross-spectrum observed with insufficient m. 

When the number $m$ of averaged realizations is insufficient for the DUT noise
to show up, the system behaves as if the two channels were (almost) statistically
independent. Under these conditions we can predict the spectrum by setting $X \approx A$,
$Y \approx B$ and $C \approx 0$, thus $E\{S_{yx}\} \approx 0$.

The estimator $\hat{S}_{yx} = |\langle S_{yx} \rangle_m|$ has Rayleigh distribution with $2m$ degrees of
freedom. Normalizing on the single-channel background $E\{S_{xx}\} = E\{S_{yy}\} = 1$,
and using the results of Sec. 6 we find that

$$ E\{ \hat{S}_{yx} \} = \sqrt{\frac{\pi}{4m}} = \frac{0.886}{\sqrt{m}} $$

$$ \mathrm{dev}\{ \hat{S}_{yx} \} = \sqrt{ \frac{1}{m} \Big( 1 - \frac{\pi}{4} \Big) } = \frac{0.463}{\sqrt{m}} $$

and therefore

$$ \frac{ \mathrm{dev}\{ \hat{S}_{yx} \} }{ E\{ \hat{S}_{yx} \} } = \sqrt{ \frac{4}{\pi} - 1 } = 0.523 \qquad \text{(independent of } m \text{)} . $$

The track is centered at $0.886/\sqrt{m}$. This is the estimator bias. The track looks like a
horizontal band located at avg ± dev, thus on a logarithmic scale from
$10 \log_{10}(1 - \mathrm{dev}/\mathrm{avg}) = -3.21$ dB to $10 \log_{10}(1 + \mathrm{dev}/\mathrm{avg}) = +1.83$ dB, asymmetrically
distributed around the average. This is shown in Fig. 5. For $m < 100$, the
blue plot labeled $|S_{yx}|$ decreases proportionally to $1/\sqrt{m}$ and has the constant
thickness of half a decade (5 dB), independent of $m$.

Figure 5: Simulated PSD, plotted for increasing number $m$ of averaged
realizations. The parameter g = 0.32 (-10 dB), which is $\kappa$ in the main text, is the
correlated noise, while the single-channel background is equal to one.
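The numbers quoted for the track on a logarithmic scale follow from the deviation-to-average ratio alone. A minimal check, with variable names chosen for this example:

```python
import math

# Deviation-to-average ratio of a Rayleigh-distributed estimator.
dev_over_avg = math.sqrt(4.0 / math.pi - 1.0)

# Edges of the band avg ± dev, in dB relative to the average.
lower_db = 10.0 * math.log10(1.0 - dev_over_avg)
upper_db = 10.0 * math.log10(1.0 + dev_over_avg)
thickness_db = upper_db - lower_db

print(f"dev/avg   = {dev_over_avg:.3f}")
print(f"lower     = {lower_db:+.2f} dB")
print(f"upper     = {upper_db:+.2f} dB")
print(f"thickness = {thickness_db:.2f} dB")
```

The thickness comes out close to 5 dB, which is the "half a decade" mentioned above.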



4.4 Cross-spectrum observed with large m. 

When the number $m$ of averaged realizations is large enough, the background
noise vanishes and the DUT spectrum shows up. The cross spectrum no longer
decreases but the variance still does. Qualitatively speaking, the average is set
by the DUT noise $S_{cc}$ and the deviation is set by the instrument background
divided by $\sqrt{m}$. On a logarithmic scale, the track no longer decreases and starts
shrinking. This is shown in Fig. 5 for $m > 100$, blue plot labeled $|S_{yx}|$.

The above reasoning can be reversed. The simultaneous observation that
the cross spectrum stops decreasing and shrinks is the signature that the
averaging process is converging. The single-channel background is rejected and the
instrument measures the DUT noise (or the hardware limit, whichever is higher).
This fact is of paramount importance in some measurements, where for some
reason we cannot remove the DUT.



5 Estimation of Sxx 

The measurement accuracy depends on three main factors, instrument cali- 
bration, instrument background (front-end and quantization), and statistical 
estimation. Only the latter is analyzed in this Section. 

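As a property of zero-mean white Gaussian noise, the Fourier transform
$X = X' + \imath X''$ is also zero-mean Gaussian, and the energy is equally split
between $X'$ and $X''$. Restricting our attention to a generic point (i.e., to an
unspecified frequency), the PSD is

$$ S_{xx} = \frac{1}{T} X X^* = \frac{1}{T} \big[ X'^2 + X''^2 \big] . $$

For use in this Section we define

$$ \varsigma^2 = E\{ S_{xx} \} , $$

which is the power in 1 Hz bandwidth. Since $X'$ and $X''$ are zero-mean
Gaussian-distributed random variables, a single realization

$$ S_{xx} = \frac{1}{T} \big[ X'^2 + X''^2 \big] $$

follows a $\chi^2$ distribution with two degrees of freedom. After our definition of
$\varsigma^2$, we find that

$$ V\{X'\} = V\{X''\} = \frac{T}{2} \varsigma^2 $$

because $S_{xx}$ includes a factor $1/T$. This is seen on the "scaled $\chi^2$" column of
Table 2 after setting $\nu = 2$ (degrees of freedom) and $\sigma^2 = \frac{1}{2} T \varsigma^2$. On that Table
we find that $E\{S_{xx}\} = \frac{1}{T} \nu \sigma^2$, which is equal to $\varsigma^2$, and that $V\{S_{xx}\} = \frac{1}{T^2} 2 \nu \sigma^4$,
hence

$$ V\{ S_{xx} \} = \varsigma^4 . $$

Averaging on $m$ realizations of $S_{xx}$

$$ \langle S_{xx} \rangle_m = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{T} \big[ X_i'^2 + X_i''^2 \big] $$

we notice that $\langle S_{xx} \rangle_m$ has $\chi^2$ distribution with $2m$ degrees of freedom. Using
the right-hand column of Table 2 we find $V\{ \langle S_{xx} \rangle_m \} = \frac{1}{m} \varsigma^4$. The uncertainty
(standard deviation) is therefore

$$ \mathrm{dev}\{ \langle S_{xx} \rangle_m \} = \frac{\varsigma^2}{\sqrt{m}} , \qquad \frac{ \mathrm{dev}\{ \langle S_{xx} \rangle_m \} }{ E\{ \langle S_{xx} \rangle_m \} } = \frac{1}{\sqrt{m}} . $$

Figure 6 shows an example PDF of the spectrum averaged on $m$ realizations.
The distribution is normalized for the standard deviation to be equal to one.
Increasing $m$, the PDF converges to the normal distribution and shrinks.
Finally, we may find useful the following normalization

$$ S_{aa} = 1 \ \text{(background)} , \qquad S_{cc} = \kappa^2 \ \text{(DUT)} . $$

Expanding $X = X' + \imath X'' = (A' + C') + \imath (A'' + C'')$ we notice that $X$ is zero-mean
white Gaussian noise, and that

$$ E\{ \langle S_{xx} \rangle_m \} = 1 + \kappa^2 , \qquad \mathrm{dev}\{ \langle S_{xx} \rangle_m \} = \frac{1 + \kappa^2}{\sqrt{m}} . $$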

6 Estimation of Syx and noise rejection 

It is obvious from Eq. (5) that the spectrum $S_{xx}(f)$ always takes real positive
values, even if averaged on a small number of realizations. Since some kind of
fundamental noise is always present in a physical experiment, the probability
that $S_{xx}(f)$ nulls at some frequency is zero. Conversely, the cross-spectrum
$S_{yx}(f)$ is a complex function that converges to the positive function $S_{cc}(f)$ only
after averaging on a sufficient number $m$ of realizations, as seen in Eq. (11).

Figure 6: Probability density function $f(x)$ of the PSD averaged on $m$
realizations.

In numerous practical cases we need to plot $S_{yx}(f)$ on a logarithmic vertical
scale, which is of course impossible where $S_{yx}(f)$ is not positive.

• In radio engineering virtually all spectra are given in decibels, which
implies a logarithmic scale.

• When the spectrum spreads over a large dynamic range, only a compressed
scale makes sense. The logarithmic scale is by far the preferred
representation.

• Numerous spectra found in physical experiments follow a polynomial law
because the time-domain derivative (integral) maps into a multiplication
(division) of the spectrum by $f^2$. On a logarithmic plot, a power of $f$
maps into a straight line.

• It is explained in Section 4 that, running the experiment, average and
deviation of the instrument noise are ruled by the same $1/\sqrt{m}$ law until
the number of averaged realizations is sufficient for $S_{yx}(f)$ to converge to
$S_{cc}(f)$. This is most comfortably seen on a logarithmic scale.

Thus, we need to extend Section 5 to the cross spectrum, discussing the suitable
estimators. The estimator may introduce noise and bias. In everyday life a
better estimator may save only a small amount of time, and in this case it
could be appreciated mainly because it is smarter. Conversely, in long-term
measurements, like timekeeping and radioastronomy, a single data point takes
years of observation. Here, the choice of the estimator may determine whether
the experiment is feasible or not.
6.1 Basic material

Let us expand $S_{yx}$

$$ S_{yx} = \frac{1}{T} E\{ Y X^* \} $$
$$ = \frac{1}{T} E\{ (B + C) \times (A + C)^* \} $$
$$ = \frac{1}{T} E\{ (B' + \imath B'' + C' + \imath C'') \times (A' - \imath A'' + C' - \imath C'') \} $$
$$ = \frac{1}{T} E\{ B'A' + B''A'' + B'C' + B''C'' + C'A' + C''A'' + C'^2 + C''^2 $$
$$ \qquad + \imath ( B''A' - B'A'' + B''C' - B'C'' + C''A' - C'A'' ) \} \qquad (15) $$

and simplify the calculus by normalizing on the variances as follows

$$ V\{A\} = 1 , \qquad V\{A'\} = 1/2 , \qquad V\{A''\} = 1/2 $$
$$ V\{B\} = 1 , \qquad V\{B'\} = 1/2 , \qquad V\{B''\} = 1/2 $$
$$ V\{C\} = \kappa^2 \ll 1 , \qquad V\{C'\} = \kappa^2/2 , \qquad V\{C''\} = \kappa^2/2 $$

Notice that an additional factor $T$ must be added a-posteriori for a proper
normalization on $E\{S_{aa}\} = E\{S_{bb}\} = 1$ (background power in 1 Hz bandwidth
equal to one), as we did in Section 5. Thanks to energy equipartition, it follows
that $V\{A'\} = 1/2 \to V\{A'\} = T/2$, etc.

The assumption that $\kappa \ll 1$, though not necessary, is quite representative
of actual experiments because the main virtue of the correlation method is the
capability of extracting the DUT noise when it is lower than the background.



Looking at (15), we identify the following classes

terms                                                      | E          | V            | PDF                | comment
$B'A'$, $B''A''$, $B''A'$, $B'A''$                         | 0          | 1/4          | Gauss              | product of zero-mean Gaussian processes
$B'C'$, $B''C''$, $C'A'$, $C''A''$, $B''C'$, $B'C''$, $C''A'$, $C'A''$ | 0 | $\kappa^2/4$ | Gauss          | product of zero-mean Gaussian processes
$C'^2 + C''^2$                                             | $\kappa^2$ | $\kappa^4$   | $\chi^2$, $\nu = 2$ | sum of zero-mean square Gaussian proc.

Equation (15) can be rewritten as

$$ S_{yx} = \frac{1}{T} E\{ \mathscr{A} + \imath \mathscr{B} + \mathscr{C} \} \qquad (16) $$

where the terms

$$ \mathscr{A} = B'A' + B''A'' + B'C' + B''C'' + C'A' + C''A'' $$
$$ \mathscr{B} = B''A' - B'A'' + B''C' - B'C'' + C''A' - C'A'' $$
$$ \mathscr{C} = C'^2 + C''^2 $$

have the statistical properties listed underneath. Notice that $\langle \mathscr{C} \rangle_m$ follows a
$\chi^2$ distribution with $2m$ degrees of freedom, thus for large $m$ it can be approximated
with a Gaussian distributed variable of equal average and variance, which is
denoted with $\langle \tilde{\mathscr{C}} \rangle_m$.

term                                   | E          | V                        | PDF                  | comment
$\langle \mathscr{A} \rangle_m$        | 0          | $(1 + 2\kappa^2)/(2m)$   | Gauss                | average (sum) of zero-mean Gaussian processes
$\langle \mathscr{B} \rangle_m$        | 0          | $(1 + 2\kappa^2)/(2m)$   | Gauss                | average (sum) of zero-mean Gaussian processes
$\langle \mathscr{C} \rangle_m$        | $\kappa^2$ | $\kappa^4/m$             | $\chi^2$, $\nu = 2m$ | average (sum) of chi-square processes
$\langle \tilde{\mathscr{C}} \rangle_m$ | $\kappa^2$ | $\kappa^4/m$            | Gauss                | approximates $\langle \mathscr{C} \rangle_m$ for large $m$

Next, we will analyze the properties of some useful estimators of $S_{yx}$.
Running an experiment, the logarithmic plot is comfortable because the
average-to-deviation ratio is easily identified as the thickness of the track, independent
of the vertical position. Yet, the logarithmic plot can only be used to display
nonnegative quantities.

6.2 $\hat{S}_{yx} = |\langle S_{yx} \rangle_m|$

The main reason for us to pay attention to this estimator is that it is the
default setting for cross-spectrum measurement in most FFT analyzers. Besides,
it can be used in conjunction with $\arg \langle S_{yx} \rangle_m$ when the delay of the two
channels is not equal and useful information is contained in
the argument, as it happens in radio-astronomy. $|\langle S_{yx} \rangle_m|$ is of course suitable
for a logarithmic plot because it can only take nonnegative values. The relevant
objections against this estimator are

• There is no need to take in $\Im\{S_{yx}\}$, which contains half of the total
background noise.

• The instrument background turns into relatively large estimation bias.

For large $m$, where $\langle \mathscr{C} \rangle_m$ tends to $\langle \tilde{\mathscr{C}} \rangle_m$, the estimator is expanded as

$$ |\langle S_{yx} \rangle_m| = \sqrt{ \big[ \Re\{ \langle Y X^* \rangle_m \} \big]^2 + \big[ \Im\{ \langle Y X^* \rangle_m \} \big]^2 } $$
6.2.1 The (not so) silly case of $\kappa = 0$

The analysis of this case tells us what happens when $m$ is insufficient for the
single-channel background to be rejected, so that the displayed average spectrum is
substantially the bias of the estimator. Since $c \leftrightarrow C = 0$, it holds that $\mathscr{C} = 0$.
Letting

$$ \langle \mathscr{Z} \rangle_m = \sqrt{ \langle \mathscr{A} \rangle_m^2 + \langle \mathscr{B} \rangle_m^2 } , $$

we notice that $\langle \mathscr{Z} \rangle_m$ is Rayleigh distributed with $2m$ degrees of freedom. Using
Table 3 we find that

$$ E\{ \langle \mathscr{Z} \rangle_m \} = \sqrt{\frac{\pi}{4m}} = \frac{0.886}{\sqrt{m}} \qquad \text{(average)} $$

$$ V\{ \langle \mathscr{Z} \rangle_m \} = \frac{1}{m} \Big( 1 - \frac{\pi}{4} \Big) = \frac{0.215}{m} \qquad \text{(variance)} $$

Figure 7 compares the case $m = 1$ (Rayleigh distribution) to the Gaussian
distribution associated with the best estimator (Section 6.3).

Interestingly, the deviation-to-average ratio, which also applies to $|\langle S_{yx} \rangle_m|$,

$$ \frac{ \mathrm{dev}\{ |\langle S_{yx} \rangle_m| \} }{ E\{ |\langle S_{yx} \rangle_m| \} } = \sqrt{ \frac{4}{\pi} - 1 } = 0.523 \qquad (17) $$

is independent of $m$. In logarithmic scale, the cross spectrum appears as a
strip decreasing as $5 \log_{10}(m)$ dB, yet of constant thickness of approximately 5
dB (dev/avg). This is seen in the example of Fig. 5.

6.2.2 Large number of averaged realizations 

The estimator converges to $\kappa^2$, which is trivial, and for $\kappa \ll 1$ the
deviation-to-average ratio is approximately $1/\sqrt{m}$. This issue is not further expanded
here.

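6.3 $\hat{S}_{yx} = \Re\{ \langle S_{yx} \rangle_m \}$

This is the best estimator to the extent that

• All the useful information is in $\Re\{S_{yx}\} = \frac{1}{T} E\{ \mathscr{A} + \mathscr{C} \}$.

• Since the instrument background is equally split in $\Re\{S_{yx}\}$ and $\Im\{S_{yx}\}$,
discarding $\Im\{S_{yx}\}$ results in 3 dB improvement of the SNR.

• For the same reason, the instrument background does not contribute to
the bias.

The main drawback is that this estimator is not suitable for a logarithmic plot
because $\Re\{ \langle S_{yx} \rangle_m \}$ can take negative values, especially at small $m$. For large
$m$ we can approximate $\langle \mathscr{C} \rangle_m$ with $\langle \tilde{\mathscr{C}} \rangle_m$, which is Gaussian distributed. Letting

$$ \langle \mathscr{Z} \rangle_m = \langle \mathscr{A} \rangle_m + \langle \tilde{\mathscr{C}} \rangle_m , $$

the PDF of $\langle \mathscr{Z} \rangle_m$ is Gaussian (Fig. 8). Using the results of Sec. A.2 we find

$$ E\{ \langle \mathscr{Z} \rangle_m \} = \kappa^2 \qquad (18) $$

$$ V\{ \langle \mathscr{Z} \rangle_m \} = \frac{1 + 2\kappa^2 + 2\kappa^4}{2m} \qquad (19) $$

$$ \mathrm{dev}\{ \langle \mathscr{Z} \rangle_m \} = \sqrt{ \frac{1 + 2\kappa^2 + 2\kappa^4}{2m} } \qquad (20) $$

$$ \frac{ \mathrm{dev}\{ \langle \mathscr{Z} \rangle_m \} }{ E\{ \langle \mathscr{Z} \rangle_m \} } = \frac{ \sqrt{1 + 2\kappa^2 + 2\kappa^4} }{ \kappa^2 \sqrt{2m} } \approx \frac{1}{ \kappa^2 \sqrt{2m} } \quad \text{for } \kappa \ll 1 \qquad (21) $$

$$ P_N = \frac{1}{2} \mathrm{erfc}\Big( \frac{\mu}{\sqrt{2}\, \sigma} \Big) \qquad (P\{x < 0\}, \text{ Sec. A.2}) \qquad (22) $$

$$ P_P = 1 - \frac{1}{2} \mathrm{erfc}\Big( \frac{\mu}{\sqrt{2}\, \sigma} \Big) \qquad (P\{x > 0\}, \text{ Sec. A.2}) \qquad (23) $$

Figure 8: PDF of the estimator $\hat{S}_{yx} = \Re\{ \langle S_{yx} \rangle_m \}$.

Accordingly, for $\kappa \ll 1$ a 0 dB SNR requires that $m = \frac{1}{2\kappa^4}$. If for example the
DUT noise is 20 dB lower than the single-channel background, thus $\kappa = 0.1$,
averaging on $5 \times 10^3$ spectra is necessary to get a SNR of 0 dB. On the other
hand, if $\kappa \gg 1$ the deviation-to-average ratio given by (21) converges to $1/\sqrt{m}$, which is
what we expect if the instrument background is negligible.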
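The averaging requirement can be turned into a small calculator. This is a sketch, assuming the deviation-to-average ratio of the real-part estimator is $\sqrt{1 + 2\kappa^2 + 2\kappa^4} / (\kappa^2 \sqrt{2m})$ with the background normalized to one, and defining the SNR as the linear average-to-deviation ratio (1 corresponds to 0 dB). The function names are hypothetical.

```python
import math


def dev_to_avg_real_estimator(kappa, m):
    """Deviation-to-average ratio of Re{<S_yx>_m}, background normalized to 1."""
    return math.sqrt(1.0 + 2.0 * kappa**2 + 2.0 * kappa**4) / (kappa**2 * math.sqrt(2.0 * m))


def required_m(kappa, snr=1.0):
    """Number of averages giving an average-to-deviation ratio equal to snr."""
    return (1.0 + 2.0 * kappa**2 + 2.0 * kappa**4) * snr**2 / (2.0 * kappa**4)


def required_m_small_kappa(kappa, snr=1.0):
    """Approximation of required_m valid for kappa << 1."""
    return snr**2 / (2.0 * kappa**4)


if __name__ == "__main__":
    kappa = 0.1  # DUT noise 20 dB below the single-channel background
    print(required_m_small_kappa(kappa))  # about 5e3 spectra for 0 dB SNR
    print(required_m(kappa))              # slightly more, without the approximation
```

Since the required number of averages grows as $1/\kappa^4$, each additional 10 dB of rejection costs a factor of 100 in measurement time.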

6.3.1 Precision vs. energy conservation 



The term $\sqrt{2}$ in the denominator of (21) means that the SNR of the correlation
system is 3 dB better than the single-channel system. In a physical system
ruled by energy conservation this factor does not come for free because the
DUT power is equally split into two channels. The conclusion is that the factor
$\sqrt{2}$ in the SNR cancels with the $\sqrt{2}$ intrinsic loss of the power splitter. So, the
basic conservation laws of thermodynamics (or information) are not violated.

Figure 9: PDF of the estimator $\hat{S}_{yx} = |\Re\{ \langle S_{yx} \rangle_m \}|$.

Figure 10: PDF of the estimator obtained averaging the positive values of
$\Re\{S_{yx}\}$.

6.4 Sy^=\^{{Sy.)j\ 

The negative values of {Syx)^ are folded up, so that Syx is always positive and 
can be plotted on a logarithmic axis. Approximating with ('^)„j for large 

TO, the estimator is expanded as 

The PDF of \'^{{S^^}\ is obtained from the PDF of \^{{Syx) ^}\ already 
studied in Section 6.3 by folding the negative-half-plane of the original PDF 



on the positive half plane. The result is shown in Fig. [9] 

6.5 $\hat{S}_{yx} = \Re\{\langle S_{yx}\rangle_{m'}\}$, averaging on the positive values

Averaging $m$ values of $\Re\{S_{yx}\}$, we expect $m' = mP_P$ positive values and
$m - m' = mP_N$ negative values. This estimator consists of averaging on
the $m'$ positive values, discarding the negative values. As usual, we assume
that for large $m$ the term $\langle\tilde{\mathcal{C}}\rangle_m$ is approximated with $\langle\mathcal{C}\rangle_m$ so that its PDF is
Gaussian. The PDF of this estimator is formed from the PDF of $\Re\{\langle S_{yx}\rangle_m\}$
after removing the negative-half-plane values and scaling up the result for the
integral of the PDF to be equal to one. This is illustrated in Fig. 10.

Footnote: A theorem states the following. Let $x$ be a random variable, $f(x)$ its PDF, and $y = |x|$ a
function of $x$. The PDF of $y$ is $g(y) = f(y)u(y) + f(-y)u(y)$, where $u(y)$ is the Heaviside
(step) function. Notice that the term $f(-y)u(y)$ is the negative-half-plane ($x < 0$) side of
$f(x)$ folded to the positive half-plane.

Figure 11: PDF of the estimator $\hat{S}_{yx} = \langle\max(\Re\{S_{yx}\}, 0^+)\rangle_m$.

Figure 12: Comparison of the estimators based on $\Re\{S_{yx}\}$.



6.6 Estimator $\hat{S}_{yx} = \langle\max(\Re\{S_{yx}\}, 0^+)\rangle_m$

Averaging $\Re\{S_{yx}\}$, the negative values are replaced with $0^+$. The reason for
using $0^+$ instead of just 0 is that $\lim_{x\to0^+}\log(x)$ exists, while $\lim_{x\to0}\log(x)$
does not. The notation "$0^+$" is a nerdish replacement for the "smallest positive
floating-point number" available in the computer. This small number is equivalent
to zero for all practical purposes, but never produces a floating-point error
in the evaluation of the logarithm. Since the negative values are replaced with
zero, the PDF of this estimator (Fig. 11) derives from the PDF of $\Re\{\langle S_{yx}\rangle_m\}$
replacing the negative-half-plane side with a Dirac delta function.
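
The replacement of negative values with "0+" can be sketched in a few lines. This is only an illustration: the choice of `numpy.finfo(float).tiny` (the smallest positive normalized double) as the stand-in for 0+ and the function names are assumptions, not something prescribed by the text.

```python
import numpy as np

# Stand-in for "0+": smallest positive normalized double (an illustrative choice).
ZERO_PLUS = np.finfo(float).tiny

def positive_clipped_average(re_syx):
    """Average of Re{Syx} after replacing non-positive values with 0+."""
    re_syx = np.asarray(re_syx, dtype=float)
    return np.maximum(re_syx, ZERO_PLUS).mean()

def to_db(value):
    """10 log10 of a strictly positive value."""
    return 10.0 * np.log10(value)

if __name__ == "__main__":
    samples = [1.0, -1.0, 3.0]            # hypothetical values of Re{Syx}
    avg = positive_clipped_average(samples)
    print(avg, to_db(avg))
    # Even when every sample is negative the logarithm stays finite.
    print(to_db(positive_clipped_average([-1.0, -2.0])))
```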



6.7 Choice among the positive (biased) estimators 

Having accepted that an estimator suitable to logarithmic plot is positive, thus
inevitably biased, the best choice is the estimator that exhibits the lowest variance
and the lowest bias. This criterion first excludes $|\langle S_{yx}\rangle_m|$ in favor of one
of the estimators based on $\Re\{\langle S_{yx}\rangle_m\}$ because $\Im\{S_{yx}\}$ contains only the instrument
background, which goes in both average (bias) and variance of $|\langle S_{yx}\rangle_m|$.
Taking $\Im\{S_{yx}\}$ away, the estimator is necessarily based on $\Re\{\langle S_{yx}\rangle_m\}$.

Then, we search for a suitable low-bias estimator with the heuristic reasoning
shown in Figure 12.

Footnote: A theorem states the following. Let $f(x)$ be the PDF of a process, and $g(x)$ the PDF
conditional to the event $e$. The conditional PDF is obtained in two steps. First, an auxiliary
function $h(x)$ is obtained from $f(x)$ by selecting the sub-domain defined by $e$. Second, the
desired PDF is $g(x) = h(x) / \int_{-\infty}^{\infty} h(x)\,dx$. The first step generates $h(x)$ equal to $f(x)$, but
taking away the portions not allowed by $e$. The second step scales the function $h(x)$ up so
that $\int_{-\infty}^{\infty} g(x)\,dx = 1$ (probability of all possible events), thus it is a valid PDF.



It is shown in Sec. 6.3 that for large $m$ the PDF of $\Re\{\langle S_{yx}\rangle_m\}$ is a Gaussian
distribution with mean value $\mu = \kappa^2$ and variance $\sigma^2 = (1 + 2\kappa^2 + 2\kappa^4)/2m$. The probability
of the events $\Re\{\langle S_{yx}\rangle_m\} < 0$ is represented in Fig. 8 as the grey area on the
left-hand half-plane. These events have probability $P_N$. Using the results of
Section A.2, the average of these negative events is

$$\mu_N = \int_{-\infty}^{\infty} x f_N(x)\,dx = \mu - \frac{\sigma}{\frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right)\sqrt{2\pi}\,\exp\!\left(\frac{\mu^2}{2\sigma^2}\right)} \qquad \text{(Eq. (39))} .$$

The estimator is made positive by moving the area $P_N$ from the left-hand half-plane
to the right-hand half-plane. The bias depends on the shape taken by this
area, and ultimately on the average associated to this shifted $P_N$. By inspection
on Fig. 12 we notice that

Section 6.5, $\hat{S}_{yx} = \Re\{\langle S_{yx}\rangle_{m'}\}$, makes use only of the positive values; the
negative values are discarded. The PDF area associated to $P_N$ has the
same shape as the right-hand side of the PDF. We denote the average of
this shape with $\mu_1$.

Section 6.4, $\hat{S}_{yx} = |\Re\{\langle S_{yx}\rangle_m\}|$. The shadowed area associated to $P_N$ is
flipped from the negative half-plane to the positive half-plane. The average
is $\mu_2 = -\mu_N$.

Section 6.6, $\hat{S}_{yx} = \langle\max(\Re\{S_{yx}\}, 0^+)\rangle_m$. The shadowed area associated to
$P_N$ collapses into a Dirac delta function. The average is $\mu_3 = 0$.

From the graphical construction of Fig. 12 it is evident that

$$\mu_1 > \mu_2 > \mu_3 .$$

The obvious conclusion is that the preferred estimator is

$$\hat{S}_{yx} = \langle\max(\Re\{S_{yx}\}, 0^+)\rangle_m \qquad \text{(Preferred, Sec. 6.6)} .$$

It is worth pointing out that the naive approach of just discarding the negative
values before averaging (Sec. 6.5) turns out to be the worst choice among the
estimators we analyzed.
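
The ordering of the three biases can be checked with a small Monte Carlo experiment. This is a sketch under simplifying assumptions: the values to be made positive are drawn directly from a Gaussian distribution with arbitrary illustrative parameters `mu` and `sigma` (they are not taken from a measurement), and the three positive estimators are applied to those samples.

```python
import numpy as np

def compare_positive_estimators(mu, sigma, n=200_000, seed=0):
    """Mean of three positive estimators applied to Gaussian samples N(mu, sigma^2)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(mu, sigma, n)
    zero_plus = np.finfo(float).tiny          # stand-in for "0+"
    return {
        "discard_negative": z[z > 0].mean(),  # Sec. 6.5: average the positive values only
        "absolute_value": np.abs(z).mean(),   # Sec. 6.4: fold the negative values up
        "clip_to_zero_plus": np.maximum(z, zero_plus).mean(),  # Sec. 6.6
    }

if __name__ == "__main__":
    result = compare_positive_estimators(mu=0.5, sigma=1.0)
    for name, value in result.items():
        print(f"{name:18s} {value:.3f}  (bias {value - 0.5:+.3f})")
```

With these parameters all three estimators are biased upward, and the bias is largest when the negative values are discarded and smallest when they are replaced with 0+, in agreement with the heuristic reasoning.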

6.8 The use of $\Im\{\langle S_{yx}\rangle_m\}$

It has been shown in Sec. 6 (Eq. (15)) that all the DUT signal goes into $\Re\{S_{yx}\}$,
and that $\Im\{S_{yx}\}$ contains only the instrument background. More precisely, (15)
is rewritten as

$$S_{yx} = \mathcal{A} + i\mathcal{B} + \tilde{\mathcal{C}} \qquad \text{(Eq. (16))}$$

$$\Re\{S_{yx}\} = \mathcal{A} + \tilde{\mathcal{C}} \quad\text{and}\quad \Im\{S_{yx}\} = \mathcal{B}$$

where $\mathcal{A}$ and $\mathcal{B}$ come from the background and have equal statistics, and $\tilde{\mathcal{C}}$ comes
from the DUT spectrum. Therefore

• $\Im\{\langle S_{yx}\rangle_m\}$ is a good estimator of the background

• the contrast $\Re\{\langle S_{yx}\rangle_m\} - \Im\{\langle S_{yx}\rangle_m\}$ is a good indicator of the averaging
convergence to $S_{cc}$.

Figure 13: Effect of the finite duration of the measurement on the spectrum.
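
The behavior of the real and imaginary parts of the averaged cross spectrum can be illustrated with simulated white noise. The sketch below is an assumption-laden toy model, not the authors' procedure: the two backgrounds have unit variance, the DUT noise has variance `kappa**2`, and the cross spectrum is normalized by the record length `n` so that the background PSD equals one. All names and parameter values are illustrative.

```python
import numpy as np

def averaged_cross_spectrum(kappa=0.3, m=2000, n=256, seed=1):
    """Return Re and Im of the cross spectrum averaged over m simulated records."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(m, n))            # background of instrument A
    b = rng.normal(size=(m, n))            # background of instrument B
    c = kappa * rng.normal(size=(m, n))    # DUT noise, common to both channels
    X = np.fft.rfft(a + c, axis=1)
    Y = np.fft.rfft(b + c, axis=1)
    syx = Y * np.conj(X) / n               # one cross spectrum per record
    avg = syx.mean(axis=0)                 # average over the m records
    # Drop the DC and Nyquist bins, whose imaginary part is identically zero.
    return avg.real[1:-1], avg.imag[1:-1]

if __name__ == "__main__":
    re, im = averaged_cross_spectrum()
    print("mean Re:", re.mean(), "(DUT PSD is kappa^2 = 0.09)")
    print("mean Im:", im.mean(), "(background only, averages toward zero)")
    print("rms  Im:", np.sqrt(np.mean(im**2)), "(residual background per bin)")
```

In this toy model the real part settles on the DUT level while the imaginary part keeps only the residual background, so comparing the two gives a practical indication of how far the averaging has progressed.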

7 Statistical independence on the frequency axis 

As a relevant property of white Gaussian noise, the Fourier transform is also 
Gaussian with all values on the frequency axis statistically-independent. This 
property is taken as a good representation of the reality even in the case of 
discrete spectra measured on a finite measurement time T, and used extensively 
in this report. Yet, in a strictly mathematical sense time-domain truncation 
breaks the hypothesis of statistical independence in the frequency domain. This 
happens because time truncation is equivalent to a multiplication by a rectan- 
gular pulse, which maps into a convolution by a sinc( ) function in the frequency 
domain. This concept is shown in Fig. 13 and expanded as follows

$$x(t) \leftrightarrow X(f)$$

$$x_T(t) = x(t)\,\Pi(t/T) \leftrightarrow X_T(f) = X(f) * T\,\frac{\sin(\pi T f)}{\pi T f}$$

where

$$\Pi(t) = \begin{cases} 1 & -1/2 < t < 1/2 \\ 0 & \text{elsewhere} \end{cases} \qquad\text{and}\qquad \mathrm{sinc}(f) = \frac{\sin(\pi f)}{\pi f} .$$

The consequences are the following.

• The side-lobes of Tsinc(T/) cause energy leakage, thus a small correlation 
on the frequency axis. 

• Accuracy is reduced because each point collects energy from other fre- 
quencies. This may show up in the presence of high peaks (50-60Hz, for 
example) or high roll-off bumps. 

• One should question whether the number of degrees of freedom is reduced. 

The truncation function is called "window" on the front panel of analyzers, and 
sometimes "taper" in textbooks about spectral analysis. Reduced frequency 
leakage is obtained by a different choice of the truncation function, like the 
Bartlett (triangular), Hanning (cosine) or Parzen (cubic) window. 
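
The effect of the window on frequency leakage can be seen with a short numerical experiment. The sketch below is illustrative only: the record length, the tone frequency (deliberately placed between two FFT bins) and the probe bin are arbitrary assumptions, and `numpy.hanning` is used as the Hanning window.

```python
import numpy as np

def leakage_db(window, f0_bins=100.5, probe_bin=300):
    """Power at a distant bin relative to the spectral peak, in dB."""
    n = len(window)
    t = np.arange(n)
    x = np.sin(2 * np.pi * f0_bins * t / n)      # tone falling between two bins
    power = np.abs(np.fft.rfft(x * window)) ** 2
    return 10 * np.log10(power[probe_bin] / power.max())

if __name__ == "__main__":
    n = 1024
    print("rectangular:", leakage_db(np.ones(n)), "dB")
    print("Hanning:    ", leakage_db(np.hanning(n)), "dB")
```

The rectangular truncation leaves a slowly decaying tail far from the tone, while the Hanning window suppresses it by many decades, at the price of a wider main lobe.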



8 Applications and experimental techniques 
8.1 PM noise 

The first application to frequency metrology was the measurement of hydrogen
masers [VMV64] in the early sixties. Then, the method was used for the
measurement of phase noise [WSGG76] in the seventies, but it found some popularity
only in the nineties, when dual-channel FFT analyzers started to be
available.

Figure 14 shows some of the most popular schemes for the measurement
of phase noise. The mixer is a saturated phase-to-voltage converter in Fig. 14
A-C, and a synchronous down-converter in Fig. 14 D. In all cases correlation
is used to reject the noise of the two mixers. The background noise turns out
to be limited by the thermal homogeneity, instead of the absolute temperature
referred to the carrier power. This property was understood only after working
on the scheme D [RG00]. At that time, the other schemes were already known.

The scheme A [WSGG76] is suitable to the measurement of low-noise two-port
devices, mainly passive devices showing small group delay, so that the noise
of the reference oscillator can be rejected.

The scheme B consists of two separate PLLs that measure separately the os- 
cillator under test. Correlation rejects the noise of the two reference oscillators. 
In this way, it is possible to measure an oscillator by comparing it to a pair of 
synthesizers, even if the noise of the synthesizers is higher than that of the os- 
cillator. This fact is relevant to the development of oscillator technology, when 
manufacturing makes it difficult to have the oscillator at the round frequency of 
the available standards, and also difficult to build two prototypes at the same 
frequency. 

The scheme C derives from A after introducing a delay in the arms [LSL84].
It can be implemented using either a pair of resonators or a pair of delay lines.
The use of the optical-fiber delay line is the most promising solution because
the delay line can be adapted to the arbitrary frequency of the oscillator under
test, while a resonator cannot [RSHM05]. Correlation removes the fluctuations
of the delay line [SYMR04, SCJ+07].

The scheme D is based on a bridge that nulls the carrier before amplification
and synchronous detection of the noise sidebands. This scheme derives from the
pioneering work of Sann [San68]. At that time, the mixer was used to down-convert
the fluctuation of the null at the output of a magic Tee. Amplification of
the noise sideband [Lab82] and correlation [RG00] were introduced afterwards.

With modern RF/microwave components, isolation between the two channels
may not be a serious problem. The hardware sensitivity is limited by environmental
effects, like temperature fluctuations and low-frequency magnetic fields,
and by the AM noise. The latter is taken in through the sensitivity of the mixer
offset to the input power. Only partial solutions are available [RB07].

Figure 14: Basic schemes for the measurement of phase noise.

Figure 15: Basic schemes for the measurement of amplitude noise (from [Rub05]).
8.2 AM noise 

Figure 15 shows some schemes for the cross-spectrum measurement of AM noise,
taken from [Rub05].

In Fig. 15 A, two Schottky-diode or tunnel-diode passive power detectors are
used to measure simultaneously the power fluctuations of the source under test.
Isolation between channels is guaranteed by the isolation of the power splitter
(18-20 dB) and by the fact that the power detectors do not send noise back to
the input. Correlation enables the rejection of the single-channel noise.

Figure 16: Example of cross-spectrum measurement (amplitude noise of an
oven-controlled quartz oscillator), taken from [Rub05].

As an example, Fig. 16 shows the measurement of a quartz oscillator. Converting
the $1/f$ noise into stability of the fractional amplitude $\alpha$, we get $\sigma_\alpha(\tau) =
4.3\times10^{-7}$ (Allan deviation, constant vs. the measurement time $\tau$). This oscillator
exhibits the lowest AM noise measured in our laboratory. The single-channel
noise rejection achieved by correlation and averaging is more than 10 dB.

Figure 15 B is the obvious adaptation of the scheme A to the measurement
of the laser relative intensity noise (RIN). We have started using it routinely.

The scheme of Fig. 15 C, presently under study, is intended for the measurement
of the microwave AM noise on the modulated light beam at the output of
the new generation of opto-electronic oscillators based on optical fibers [YM96], or
based on whispering-gallery optical resonators.

8.2.1 Single-channel vs. dual-channel measurements

In the measurement of PM noise it is more or less possible to test the background
of a single-channel instrument by removing the DUT. This happens because we
can always get the two phase-detector inputs from a single oscillator, which is the phase
reference.* The correlation schemes are more complex than the single-channel
counterparts, and sometimes difficult to operate. Obviously, the experimentalist
prefers the single-channel measurements and uses the correlation schemes only
when the sensitivity of the former is insufficient.

Conversely, the measurement of AM noise relies upon the power detector,
which does not work without the source. Thus we cannot remove the device
under test, and of course we cannot assess the single-channel background noise
of the instrument in this way. One can object that even in the case of PM noise
we cannot measure an oscillator in single-channel mode if we do not have a
low-noise reference oscillator. The difference is that in the case of PM noise we
can at least validate the instrument, while in the case of AM noise we cannot.

*This statement of course applies only to the background noise of the instrument. When
the instrument is used to measure an oscillator we need a reference oscillator, the noise of
which must be validated separately.

Figure 17: Measurement of the background noise of a power detector.

Another difference between AM and PM is that the phase detector is always
more or less sensitive to AM noise [RB07], while the amplitude detector is not
sensitive to phase noise. In correlation systems, this fact makes the channel
separation simple to achieve and to test.

The conclusion is that the cross-spectrum measurement is inherently simpler 
with AM noise than with PM noise. 

8.3 Other applications 

Tracking back through the literature, the first use of the cross spectrum was
for the determination of the angular size of stellar radio sources [HBJDG52].
In the case of a signal coming through two antennas separated by an appropriate
baseline, the latter introduces a delay depending on the source direction in
space. Hence the useful signal $S_{cc}$ cannot be real. Instead, the angle $\arctan(\Im/\Re)$
gives information on the source direction. The very-long-baseline interferometry
(VLBI) can be seen as a generalization of this method.

When the same method was applied to the intensity interferometer [HBT56a,
HBT56b], an anti-correlation effect was discovered, due to the discrete nature of
light. This phenomenon, known as the Hanbury Brown - Twiss effect (HBT effect),
was later observed also in microwave signals in the photonic regime [GRF+04], i.e.,
with $h\nu > kT$.

The correlation method finds another obvious application in radiometry
[All62], and of course in Johnson thermometry, which is often considered a
branch of radiometry.

Since the cross spectrum enables the comparison of the PSD of two noise sources,
it can be used to measure a temperature by comparing thermal noise to a
reference shot noise. The latter is in turn measured as a dc value by exploiting
the property of Poisson processes that the variance can be calculated from the
average. In a tunnel junction, theory predicts the amount of shot and thermal
noise. This fact can be exploited for precision thermometry [SLSS03], and
ultimately to redefine the temperature in terms of fundamental constants.

The measurement of the low $1/f$ voltage fluctuations is an important diagnostic
tool in semiconductor technology. The field-effect transistors are suitable
to this task because of the low bias current at the input. In fact, the bias
current flowing into the sample turns into a fully correlated voltage through
the Ohm law. Additionally, the electrode capacitance may limit the instrument
sensitivity. The reader can refer to [SFF99] for a detailed treatise.

In metallurgy, the cross-spectrum method has been used for the measurement
of electromigration in thin metal films through the $1/f$ fluctuation of the
conductor resistance. This is relevant in microprocessor technology because the
high current density in metal connections can limit the life of the component and
make it unreliable. For this reason, aluminum is no longer used. The high sensitivity
is based on the idea that with white Gaussian noise $X'$ and $X''$ (real and
imaginary part) are statistically independent. Synchronously detecting the signal
with two orthogonal references, it is therefore possible to reject the amplifier
noise even if a single amplifier is shared by the two channels [VSHK89]. Adapting
this idea to RF and microwaves is straightforward [RG02]. Unfortunately, we
still have no application for this.

A Mathematical background 

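A.1 Random variables and density functions

Let $\mathbf{x}$ be a random variable and $x$ a variable. Denoting with $P\{e\}$ the probability
of the event $e$, two relevant probability functions are associated with $x$ and
$\mathbf{x}$, namely the cumulative density function $F(x)$ and the probability density
function $f(x)$. They are defined as

$$F(x) = P\{\mathbf{x} < x\} \qquad \text{(cumulative density function, or CDF)} \tag{24}$$

$$f(x)\,dx = P\{x < \mathbf{x} < x + dx\} \qquad \text{(probability density function, or PDF)} . \tag{25}$$

CDF and PDF are related by

$$F(x) = \int_{-\infty}^{x} f(x')\,dx' . \tag{26}$$

The probability that $\mathbf{x}$ is in the interval $[a, b]$ is

$$P\{a \le \mathbf{x} \le b\} = \int_{a}^{b} f(x)\,dx . \tag{27}$$

The probability that $\mathbf{x}$ takes any value is equal to one, thus

$$\lim_{x\to\infty} F(x) = 1 \qquad\text{and}\qquad \int_{-\infty}^{\infty} f(x)\,dx = 1 . \tag{28}$$

The average and the variance of the random variable $\mathbf{x}$ are

$$\mathrm{E}\{\mathbf{x}\} = \int_{-\infty}^{\infty} x f(x)\,dx \tag{29}$$

$$\mathrm{V}\{\mathbf{x}\} = \mathrm{E}\{|\mathbf{x} - \mathrm{E}\{\mathbf{x}\}|^2\} = \int_{-\infty}^{\infty} (x - \mathrm{E}\{\mathbf{x}\})^2 f(x)\,dx . \tag{30}$$

A.2 Gaussian (normal) distribution (Fig. 18)

Figure 18: Gaussian (normal) PDF.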

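The Gaussian (normal) distribution has the following main properties

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \qquad \text{(Gaussian PDF)} \tag{31}$$

$$\mathrm{E}\{x\} = \mu \qquad \text{(average)} \tag{32}$$

$$\mathrm{V}\{x\} = \sigma^2 \qquad \text{(variance)} \tag{33}$$

$$P_N = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right) \qquad (P\{x < 0\}) \tag{34}$$

$$P_P = 1 - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right) \qquad (P\{x > 0\}) \tag{35}$$

A new PDF is associated to the positive events

$$f_P(x) = \begin{cases} f(x)/P_P & x > 0 \\ 0 & \text{elsewhere} \end{cases} \tag{36}$$

$$\mu_P = \int_{-\infty}^{\infty} x f_P(x)\,dx = \mu + \frac{\sigma}{\left[1 - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right)\right]\sqrt{2\pi}\,\exp\!\left(\frac{\mu^2}{2\sigma^2}\right)} \tag{37}$$

Similarly, another PDF is associated to the negative events

$$f_N(x) = \begin{cases} f(x)/P_N & x < 0 \\ 0 & \text{elsewhere} \end{cases} \tag{38}$$

$$\mu_N = \int_{-\infty}^{\infty} x f_N(x)\,dx = \mu - \frac{\sigma}{\frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right)\sqrt{2\pi}\,\exp\!\left(\frac{\mu^2}{2\sigma^2}\right)} \tag{39}$$

The following integrals related to the Gaussian PDF are useful

$$\int_{-\infty}^{0} f(x)\,dx = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right) \tag{40}$$

$$\int_{0}^{\infty} f(x)\,dx = 1 - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right) \tag{41}$$

$$\int_{-\infty}^{0} x f(x)\,dx = \mu\,\frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right) - \frac{\sigma}{\sqrt{2\pi}\,\exp\!\left(\frac{\mu^2}{2\sigma^2}\right)} \tag{42}$$

$$\int_{0}^{\infty} x f(x)\,dx = \mu\left[1 - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\mu}{\sqrt{2}\,\sigma}\right)\right] + \frac{\sigma}{\sqrt{2\pi}\,\exp\!\left(\frac{\mu^2}{2\sigma^2}\right)} \tag{43}$$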
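
The averages of the positive and negative events of a Gaussian variable can be cross-checked numerically. The sketch below evaluates the closed forms $\mu_P = \mu + g/P_P$ and $\mu_N = \mu - g/P_N$, with $g = \sigma\exp(-\mu^2/2\sigma^2)/\sqrt{2\pi}$, and compares $\mu_P$ with a brute-force numerical integration. Function names, the integration grid and the parameter values are illustrative assumptions.

```python
import math
import numpy as np

def truncated_gaussian_means(mu, sigma):
    """Return (P_N, P_P, mu_N, mu_P) for a Gaussian N(mu, sigma^2)."""
    p_n = 0.5 * math.erfc(mu / (math.sqrt(2) * sigma))
    p_p = 1.0 - p_n
    g = sigma * math.exp(-mu**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi)
    return p_n, p_p, mu - g / p_n, mu + g / p_p

def numeric_mu_p(mu, sigma, half_width=12.0, points=240_001):
    """Average of the positive events by direct numerical integration."""
    x = np.linspace(mu - half_width * sigma, mu + half_width * sigma, points)
    f = np.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)
    pos = x > 0
    return np.sum(x[pos] * f[pos]) / np.sum(f[pos])

if __name__ == "__main__":
    p_n, p_p, mu_n, mu_p = truncated_gaussian_means(0.5, 1.0)
    print(p_n, p_p, mu_n, mu_p)
    print(numeric_mu_p(0.5, 1.0))
```

A useful sanity check is that $P_N\mu_N + P_P\mu_P = \mu$, because the two conditional averages, weighted by their probabilities, must give back the unconditional average.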



A.2.1 Sum of zero-mean Gaussian variables

Let $x_1(t)$ and $x_2(t)$ be two independent random functions with Gaussian distribution, zero mean
and variance $\sigma_1^2$ and $\sigma_2^2$. The sum $x(t) = x_1(t) + x_2(t)$ is a random function
with Gaussian distribution, zero mean and variance $\sigma^2 = \sigma_1^2 + \sigma_2^2$.

A.2.2 Sum of a nonzero-mean and a zero-mean Gaussian variable

Let $x_1(t)$ and $x_2(t)$ be two independent random functions with Gaussian distribution, and mean
and variance $\mu_1 \ne 0$, $\sigma_1^2$, $\mu_2 = 0$, and $\sigma_2^2$. The sum $x(t) = x_1(t) + x_2(t)$
is a random function with Gaussian distribution, mean $\mu = \mu_1$ and variance
$\sigma^2 = \sigma_1^2 + \sigma_2^2$.

A.2.3 Product of zero-mean Gaussian variables

Let $x_1(t)$ and $x_2(t)$ be two independent random functions with Gaussian distribution, zero mean
and variance $\sigma_1^2$ and $\sigma_2^2$. The product $x(t) = x_1(t)\,x_2(t)$ is a random function with
zero mean and variance $\sigma^2 = \sigma_1^2\sigma_2^2$. Strictly, the distribution of a single product is
not Gaussian; it is treated as Gaussian here because the average of a large number of such
products tends to a Gaussian distribution (central-limit theorem).
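
The statistics of the product of two zero-mean Gaussian variables can be explored with a short simulation. This sketch uses arbitrary illustrative values for the two deviations and for the number of averaged products; the kurtosis (equal to 3 for a Gaussian variable) is used here as a simple, assumed indicator of how close a distribution is to Gaussian.

```python
import numpy as np

def kurtosis(x):
    """Fourth central moment divided by the squared variance (3 for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2

def product_statistics(sigma1=2.0, sigma2=0.5, m=100, n=4000, seed=0):
    """Variance and kurtosis of single products, and kurtosis of m-sample averages."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(0.0, sigma1, (n, m))
    x2 = rng.normal(0.0, sigma2, (n, m))
    prod = x1 * x2
    return prod.var(), kurtosis(prod.ravel()), kurtosis(prod.mean(axis=1))

if __name__ == "__main__":
    var_single, kurt_single, kurt_avg = product_statistics()
    print("variance of the product:", var_single, "(sigma1^2 sigma2^2 = 1)")
    print("kurtosis, single product:", kurt_single)
    print("kurtosis, average of 100:", kurt_avg)
```

The variance matches $\sigma_1^2\sigma_2^2$, while the kurtosis shows that a single product has much heavier tails than a Gaussian variable and that averaging brings the distribution close to Gaussian.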



A.2.4 Fourier transform of a Gaussian variable

Let $\mathbf{x}(t)$ be a random process with Gaussian distribution and white spectrum, and
$x(t)$ a realization. The Fourier transform $X(f) = X'(f) + iX''(f)$ is a random
process with white spectrum and zero-mean Gaussian distribution. This means
that

1. At any frequency, the real part $X'(f)$ and the imaginary part $X''(f)$ are
statistically independent random variables with equal variance.

2. Given two frequencies $f_1$ and $f_2$ (or two separate frequency intervals),
$X(f_1)$ and $X(f_2)$ are statistically independent.

Interestingly, $|X| = \sqrt{(X')^2 + (X'')^2}$ has Rayleigh distribution, and $|X|^2 =
(X')^2 + (X'')^2$ has $\chi^2$ distribution with 2 degrees of freedom.

A.2.5 Discrete zero-mean Gaussian-distributed white noise

It is often convenient to use the discrete Fourier transform and spectra. Thus
we refer to

$$S(f) \rightarrow S_{ij} = \frac{1}{T}\left(X'^{\,2}_{ij} + X''^{\,2}_{ij}\right)$$

where the subscript $i$ denotes the $i$-th realization and the subscript $j$ denotes
the discrete frequency. The following properties hold for zero-mean white noise
with Gaussian distribution.

1. $X_{ij}$ is zero-mean Gaussian distributed. Thus $X'_{ij}$ and $X''_{ij}$ are zero-mean
Gaussian processes.

2. Different frequency.

• $X_{ij}$ and $X_{ik}$, $j \ne k$, are statistically independent.

• $\mathrm{V}\{X_{ij}\} = \mathrm{V}\{X_{ik}\}$ (energy equipartition).

3. Real and imaginary part.

• $X'_{ij}$ and $X''_{ij}$ are statistically independent.

• $\mathrm{E}\{X'_{ij}\} = 0$, and $\mathrm{E}\{X''_{ij}\} = 0$ (zero mean).

• $\mathrm{V}\{X'_{ij}\} = \mathrm{V}\{X''_{ij}\} = \frac{1}{2}\mathrm{V}\{X_{ij}\}$ (energy equipartition).

4. Absolute square value $|X_{ij}|^2 = |X'_{ij}|^2 + |X''_{ij}|^2$. Letting $\mathrm{V}\{X_{ij}\} = \sigma^2$

• $|X_{ij}|^2$ has $\chi^2$ distribution with two degrees of freedom.

• $\mathrm{E}\{|X_{ij}|^2\} = \sigma^2$ (average).

• $\mathrm{V}\{|X_{ij}|^2\} = \sigma^4$ (variance).

5. Sum of two independent processes, $y = x_1 + x_2 \leftrightarrow Y = X_1 + X_2$.

• $Y_{ij}$ is Gaussian distributed

• $\mathrm{V}\{Y_{ij}\} = \mathrm{V}\{X_{1,ij}\} + \mathrm{V}\{X_{2,ij}\}$.

6. Product of two independent processes, $y = x_1 x_2 \leftrightarrow Y = X_1 * X_2$.

• $Y_{ij}$ is Gaussian distributed

• $\mathrm{V}\{Y_{ij}\} = \mathrm{V}\{X_{1,ij}\} + \mathrm{V}\{X_{2,ij}\}$.

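A.3 Chi-square distribution (Table 2)

Let $x_1, x_2, \ldots, x_\nu$ be a set of normal-distributed random variables with zero mean
and variance equal to one, and

$$\chi^2 = \sum_{i=1}^{\nu} x_i^2 \tag{44}$$

a new function called 'chi-square' distribution with $\nu$ degrees of freedom. The
probability functions associated to $x = \chi^2$ and the relevant parameters are

$$f(x) = \frac{x^{\nu/2-1}\,e^{-x/2}}{\Gamma(\frac{1}{2}\nu)\,2^{\nu/2}} \qquad x \ge 0 \qquad \text{(chi-square PDF)} \tag{45}$$

$$F(x) = 1 - \frac{\Gamma(\frac{1}{2}\nu, \frac{1}{2}x)}{\Gamma(\frac{1}{2}\nu)} = \frac{\gamma(\frac{1}{2}\nu, \frac{1}{2}x)}{\Gamma(\frac{1}{2}\nu)} \qquad \text{(chi-square CDF)} \tag{46}$$

$$\mathrm{E}\{x\} = \nu \qquad \text{(average)} \tag{47}$$

$$\mathrm{E}\{x^2\} = \nu(\nu + 2) \qquad \text{(2nd moment)} \tag{48}$$

$$\mathrm{E}\{|x - \mathrm{E}\{x\}|^2\} = 2\nu \qquad \text{(variance)} . \tag{49}$$

It follows immediately from the definition of $\chi^2$ that the sum of $n$ random
variables with $\chi^2$ distribution and $\nu_j$ degrees of freedom is $\chi^2$ distributed

$$\chi^2 = \sum_{j=1}^{n} \chi_j^2 , \qquad \nu = \sum_{j=1}^{n} \nu_j .$$

In the general case, the variance of $x_1 \ldots x_\nu$ is $\sigma^2 \ne 1$. This is solved with
the transformation $x \rightarrow x/\sigma^2$. Thus $f(x) \rightarrow \frac{1}{\sigma^2}\,[f(x)]_{x \rightarrow x/\sigma^2}$, and (Cramer, p. 236)

$$f(x) = \frac{x^{\nu/2-1}\,e^{-x/2\sigma^2}}{\sigma^{\nu}\,\Gamma(\frac{1}{2}\nu)\,2^{\nu/2}} \qquad \text{(chi-square PDF)} \tag{50}$$

$$\mathrm{E}\{x\} = \sigma^2\nu \qquad \text{(average)} \tag{51}$$

$$\mathrm{E}\{x^2\} = \sigma^4\nu(\nu + 2) \qquad \text{(2nd moment)} \tag{52}$$

$$\mathrm{E}\{|x - \mathrm{E}\{x\}|^2\} = 2\sigma^4\nu \qquad \text{(variance)} . \tag{53}$$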
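
The scaled moments can be verified with a quick simulation. The sketch below sums the squares of $\nu$ zero-mean Gaussian variables of variance $\sigma^2$ and compares the sample average and variance with $\sigma^2\nu$ and $2\sigma^4\nu$. The parameter values and function names are illustrative assumptions.

```python
import numpy as np

def chi_square_moments(nu, sigma):
    """Average and variance of the scaled chi-square variable, in closed form."""
    return sigma**2 * nu, 2 * sigma**4 * nu

def simulate_chi_square(nu, sigma, n=200_000, seed=0):
    """Sample average and variance of the sum of nu squared N(0, sigma^2) variables."""
    rng = np.random.default_rng(seed)
    x = (rng.normal(0.0, sigma, (n, nu)) ** 2).sum(axis=1)
    return x.mean(), x.var()

if __name__ == "__main__":
    nu, sigma = 2, 0.7
    print("closed form:", chi_square_moments(nu, sigma))
    print("simulation: ", simulate_chi_square(nu, sigma))
```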



A.4 Rayleigh distribution 

Let $x_1$ and $x_2$ be two independent random functions with Gaussian distribution,
zero mean and equal variance $\sigma^2$, and

$$x = \sqrt{x_1^2 + x_2^2} \tag{54}$$

a new random function. This function has Rayleigh probability density function

$$f(x) = \frac{x}{\sigma^2} \exp\!\left(-\frac{x^2}{2\sigma^2}\right) , \qquad x \ge 0 \qquad \text{(Rayleigh PDF)} \tag{55}$$

$$\mathrm{E}\{x\} = \sqrt{\frac{\pi}{2}}\,\sigma \qquad \text{(average)} \tag{56}$$

$$\mathrm{E}\{x^2\} = 2\sigma^2 \qquad \text{(2nd moment)} \tag{57}$$

$$\mathrm{V}\{x\} = \mathrm{E}\{|x - \mathrm{E}\{x\}|^2\} = \frac{4 - \pi}{2}\,\sigma^2 \qquad \text{(variance)} . \tag{58}$$

The functions $x_1(t)$ and $x_2(t)$ can be interpreted as the random amplitude of
two orthogonal vectors, or the real and imaginary part of a complex random
function. Following this interpretation, $x(t)$ is the absolute value of the vector
sum. Table 3 reports some useful numerical values related to the $\sigma^2 = 1/2$
Rayleigh distribution.

Table 3: Relevant values for the Rayleigh distribution.



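Rayleigh distribution with $\sigma^2 = 1/2$

quantity (with $\sigma^2 = 1/2$)                          value    [10 log( ), dB]
average = $\sqrt{\pi/4}$                                  0.886    [-0.525]
deviation = $\sqrt{1 - \pi/4}$                            0.463    [-3.34]
dev / avg = $\sqrt{4/\pi - 1}$                            0.523    [-2.8]
(avg + dev) / avg = $1 + \sqrt{4/\pi - 1}$                1.523    [+1.83]
(avg - dev) / avg = $1 - \sqrt{4/\pi - 1}$                0.477    [-3.21]

A case of interest in averaged measurement is $\sigma^2 = 1/2m$, which yields

$$\mathrm{E}\{x\} = \sqrt{\frac{\pi}{4m}} = \frac{0.886}{\sqrt{m}} \qquad \text{(average)} \tag{59}$$

$$\mathrm{E}\{x^2\} = \frac{1}{m} \qquad \text{(2nd moment)} \tag{60}$$

$$\mathrm{V}\{x\} = \mathrm{E}\{|x - \mathrm{E}\{x\}|^2\} = \left(1 - \frac{\pi}{4}\right)\frac{1}{m} \qquad \text{(variance)} \tag{61}$$

$$\sqrt{\mathrm{V}\{x\}} = \sqrt{1 - \frac{\pi}{4}}\;\frac{1}{\sqrt{m}} = \frac{0.463}{\sqrt{m}} \qquad \text{(deviation)} \tag{62}$$

$$\frac{\sqrt{\mathrm{V}\{x\}}}{\mathrm{E}\{x\}} = \sqrt{\frac{4}{\pi} - 1} = 0.523 \quad \text{(independent of } m\text{)} \qquad \text{(dev/avg)} \tag{63}$$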
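
The entries of Table 3 follow directly from the closed forms of the Rayleigh average $\sqrt{\pi/2}\,\sigma$ and deviation $\sqrt{(4-\pi)/2}\,\sigma$. The short sketch below recomputes them, together with the values in dB; the function name and the dictionary layout are illustrative choices.

```python
import math

def rayleigh_table(sigma2=0.5):
    """Return {quantity: (value, 10*log10(value))} for a Rayleigh distribution."""
    sigma = math.sqrt(sigma2)
    avg = math.sqrt(math.pi / 2) * sigma
    dev = math.sqrt((4 - math.pi) / 2) * sigma
    rows = {
        "average": avg,
        "deviation": dev,
        "dev/avg": dev / avg,
        "(avg+dev)/avg": 1 + dev / avg,
        "(avg-dev)/avg": 1 - dev / avg,
    }
    return {name: (value, 10 * math.log10(value)) for name, value in rows.items()}

if __name__ == "__main__":
    for name, (value, db) in rayleigh_table().items():
        print(f"{name:14s} {value:.3f}  [{db:+.2f} dB]")
```

Calling `rayleigh_table(1 / (2 * m))` for any number of averages `m` leaves the three ratio rows unchanged, which is the independence of $m$ noted in Eq. (63).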



B A short introduction to AM and PM noise 

Phase noise (PM noise) is a well established subject, clearly explained in numerous
classical references, among which we prefer [Rut78, Kim97, CCI90, Vig99]
and [VA89, vol. 1, chap. 2]. Amplitude noise (AM noise), far less studied than
PM noise, is described in a similar manner. Refer to [Rub05] for a general introduction
to AM noise. Only a brief introduction to AM/PM noise is given here,
aimed at recalling the vocabulary.

The quasi-perfect sinusoidal signal of frequency $\nu_0$, of random amplitude
fluctuation $\alpha(t)$, and of random phase fluctuation $\varphi(t)$ is

$$v(t) = [1 + \alpha(t)] \cos[2\pi\nu_0 t + \varphi(t)] . \tag{64}$$

We may need that $|\alpha(t)| \ll 1$ and that $|\varphi(t)| \ll 1$ or $|\dot\varphi(t)| \ll 1$ during the
measurement.

Figure 19: Power law model for $S_\varphi(f)$ (from [RSHM05]).

B.1 Spectral representation of PM noise

Phase noise is generally reported in terms of the PSD (power spectral density)
$S_\varphi(f)$. In experiments, the single-sided PSD $S^{I}_\varphi(f)$ is preferred to the two-sided
PSD $S^{II}_\varphi(f)$ because the negative frequencies are redundant for real signals.
Complex or imaginary signals do not exist in this context. Thus, energy conservation
requires that $S^{I}_\varphi(f) = 2S^{II}_\varphi(f)$ for $f > 0$. From now on, we use $S_\varphi(f)$ as
the single-sided PSD, dropping the superscript $I$.

A model that has been found useful to describe accurately the phase noise
of oscillators and components is the power law, shown in Fig. 19

$$S_\varphi(f) = \sum_{n=-4}^{0} b_n f^n \qquad \text{(power law)} . \tag{65}$$

This model relies on the fact that white ($f^0$) and flicker ($1/f$) noises exist per
se, and that phase integration ($\times 1/f^2$) is present in oscillators. If needed, the
model can be extended to steeper processes, that is, $n < -4$.

When frequency noise (FM noise) is preferred to phase noise, the fractional
frequency fluctuation $y(t) = \dot\varphi(t)/2\pi\nu_0$ is probably the most useful quantity.
Using the power law, the spectrum $S_y(f)$ is written as

$$S_y(f) = \frac{f^2}{\nu_0^2}\,S_\varphi(f) = \sum_{n=-2}^{2} h_n f^n . \tag{66}$$
B.2 Spectral representation of AM noise 

Amplitude noise is described in the same way as phase noise or frequency noise,
and for the same reasons we use the power law

$$S_\alpha(f) = \sum_{n=-2}^{0} h_n f^n \qquad \text{(power law)} . \tag{67}$$

Yet, the set of processes found in practice is often limited to white and flicker
noise, and to random walk. Steeper processes ($n < -2$), when present, tend
to be confined to a limited region of the spectrum. They vanish at very low
frequencies, otherwise the amplitude would diverge rapidly. Notice that we use
the coefficients $h_i$ as for FM noise instead of the $b_i$ used with PM noise. The
reason is that the formulae for the Allan variance (see below) are formally equal.



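B.3 Two-sample (Allan) variance

Another tool often used is the Allan variance $\sigma_y^2(\tau) = \mathrm{E}\{\frac{1}{2}[\bar{y}_{k+1} - \bar{y}_k]^2\}$,
where $\bar{y}_k$ is the average of $y(t)$ over the $k$-th contiguous time slot of duration
$\tau$, spanning from $k\tau$ to $(k + 1)\tau$. For the most useful frequency-noise processes,
the relation between $\sigma_y^2(\tau)$ and $S_y(f)$ is

$$\sigma_y^2(\tau) = \begin{cases}
\dfrac{h_0}{2\tau} & \text{white frequency noise} \\[1ex]
h_{-1}\,2\ln(2) & \text{flicker of frequency} \\[1ex]
h_{-2}\,\dfrac{4\pi^2}{6}\,\tau & \text{random walk of frequency} \\[1ex]
\ldots & \text{(other phenomena, if any)}
\end{cases} \tag{68}$$

Similarly, letting $\sigma_\alpha^2(\tau) = \mathrm{E}\{\frac{1}{2}[\bar{\alpha}_{k+1} - \bar{\alpha}_k]^2\}$, the AM-noise variance is

$$\sigma_\alpha^2(\tau) = \frac{h_0}{2\tau} + h_{-1}\,2\ln(2) + h_{-2}\,\frac{4\pi^2}{6}\,\tau + \ldots \tag{69}$$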
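
The conversion from power-law coefficients to the two-sample variance, $h_0/2\tau + 2\ln(2)\,h_{-1} + (4\pi^2/6)\,h_{-2}\,\tau$, is easy to automate. The sketch below is a minimal helper; its name, its argument names and the flicker coefficient used in the example are hypothetical and are not taken from a measurement reported in the text.

```python
import math

def allan_variance(tau, h0=0.0, h_m1=0.0, h_m2=0.0):
    """Two-sample (Allan) variance from white, flicker and random-walk coefficients."""
    white = h0 / (2 * tau)
    flicker = 2 * math.log(2) * h_m1
    random_walk = (4 * math.pi**2 / 6) * h_m2 * tau
    return white + flicker + random_walk

def allan_deviation(tau, **coefficients):
    """Square root of the Allan variance."""
    return math.sqrt(allan_variance(tau, **coefficients))

if __name__ == "__main__":
    # Hypothetical flicker coefficient, for illustration only.
    for tau in (0.1, 1.0, 10.0):
        print(tau, allan_deviation(tau, h_m1=1e-13))
```

With flicker noise alone the deviation does not depend on the measurement time, which is why a single number is sufficient to describe the flicker floor.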
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